A Hybrid Parallel SOM Algorithm for Large Maps in Data-Mining

Authors

  • Bruno Silva
  • Nuno Marques
Abstract

We propose a method for a parallel implementation of the Self-Organizing Map (SOM) algorithm, which is widely used in data mining. We call the method Hybrid because it combines the advantages of the two common approaches, network partition and data partition, and it is particularly effective for large maps. Because a global topological ordering of the map is reached within a short period of time, the proposed method obtains this ordering during the initial epochs using the Batch Data-Partition algorithm. At that point we calculate the histogram of the input data over the map; based on this histogram the map is segmented and the corresponding input vectors are redistributed equally among the nodes. From then on, until a new segmentation takes place, each node processes only its own subset of samples within its own region of the map. Our experimental results show an average speed-up of 1.27 compared to the classical Batch Data-Partition method, while maintaining the topological information of the maps.
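
The segmentation step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the map is cut, so everything here is an assumption made for illustration: the map is stored row-major and split into contiguous column strips, the histogram counts best-matching-unit (BMU) hits per column, and the function names (`best_matching_unit`, `column_histogram`, `segment_columns`, `assign_samples`) are hypothetical rather than taken from the paper.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def best_matching_unit(sample, weights):
    """Index of the map unit whose weight vector is closest to the sample."""
    return min(
        range(len(weights)),
        key=lambda u: sum((w - x) ** 2 for w, x in zip(weights[u], sample)),
    )


def column_histogram(samples, weights, n_cols):
    """Count BMU hits per map column (assumed row-major layout: col = unit % n_cols)."""
    hist = [0] * n_cols
    cols = []
    for sample in samples:
        col = best_matching_unit(sample, weights) % n_cols
        hist[col] += 1
        cols.append(col)
    return hist, cols


def segment_columns(hist, n_workers):
    """Cut the columns into n_workers contiguous strips with roughly equal hit counts.

    Returns the exclusive end column of each strip (an assumed segmentation rule).
    """
    cumulative = list(accumulate(hist))
    total = cumulative[-1]
    bounds = []
    for k in range(1, n_workers):
        target = total * k / n_workers
        bounds.append(bisect_left(cumulative, target) + 1)
    bounds.append(len(hist))
    return bounds


def assign_samples(cols, bounds):
    """Map each sample (given its BMU column) to the worker that owns that strip."""
    return [bisect_right(bounds, col) for col in cols]


if __name__ == "__main__":
    hist = [4, 1, 1, 2, 0, 0, 2, 6]      # hits per column after the initial epochs
    bounds = segment_columns(hist, 2)     # two workers
    print(bounds)                         # strip end columns
    print(assign_samples([0, 3, 4, 7], bounds))
```

In this sketch each worker would then train only on the samples assigned to its strip until the next segmentation. A single very dense column can leave a neighbouring strip empty, so a real implementation would need a finer-grained rule than whole columns.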

Similar resources

Electrofacies clustering and a hybrid intelligent based method for porosity and permeability prediction in the South Pars Gas Field, Persian Gulf

This paper proposes a two-step approach for characterizing the reservoir properties of the world’s largest non-associated gas reservoir. This approach integrates geological and petrophysical data and compares them with the field performance analysis to achieve a practical electrofacies clustering. Porosity and permeability prediction is done on the basis of linear functions, succeeding the elec...

An automatic tool to analyze and cluster macromolecular conformations based on self-organizing maps

MOTIVATION Sampling the conformational space of biological macromolecules generates large sets of data with considerable complexity. Data-mining techniques, such as clustering, can extract meaningful information. Among them, the self-organizing maps (SOMs) algorithm has shown great promise; in particular since its computation time rises only linearly with the size of the data set. Whereas SOMs ...

Considering Topology in the Clustering of Self-organizing Maps

The Self-Organizing Map (SOM) [1] is an effective tool for clustering and data mining. One way to extract cluster structure from a trained SOM is by clustering its weights, which has great potential for automation. This potential is not fully realized by existing algorithms, and leaves large, high-dimensional, complex data to semi-manual treatment. Our main contribution is the exploitation of...

Hybrid artificial immune system and simulated annealing algorithms for solving hybrid JIT flow shop with parallel batches and machine eligibility

This research deals with a hybrid flow shop scheduling problem with parallel batching, machine eligibility, unrelated parallel machines, and different release dates to minimize the sum of the total weighted earliness and tardiness (ET) penalties. In the parallel batching situation, it is assumed that a number of machines in some stages are able to perform a certain number of jobs simultaneously. First...

Hybrid algorithms for Job shop Scheduling Problem with Lot streaming and A Parallel Assembly Stage

In this paper, a Job shop scheduling problem with a parallel assembly stage and Lot Streaming (LS) is considered for the first time in both machining and assembly stages. Lot Streaming technique is a process of splitting jobs into smaller sub-jobs such that successive operations can be overlapped. Hence, to solve job shop scheduling problem with a parallel assembly stage and lot streaming, deci...

Publication date: 2007